ABSTRACT

This paper presents research results dealing with power MOSFETs (metal oxide semiconductor field effect transistors) within the prognostics and health management of electronics. Experimental results are presented for the identification of the on-resistance as a precursor to failure of devices with die-attach degradation as a failure mechanism. Devices are aged under power cycling in order to trigger die-attach damage. In situ measurements of key electrical and thermal parameters are collected throughout the aging process and further used for analysis and computation of the on-resistance parameter. Experimental results show that the devices experience die-attach damage and that the on-resistance captures the degradation process in such a way that it could be used for the development of prognostics algorithms (data-driven or physics-based).

1.  INTRODUCTION 

The failure of electronic devices is of great concern for future aircraft, which will see an increase in the number of electronic systems for drive and control equipment critical to safety throughout the aircraft. This paper presents research results dealing with power semiconductor devices within the prognostics and health management of electronics. Gate controlled power transistors like power MOSFETs (metal oxide semiconductor field effect transistors) are power semiconductor devices employed in a variety of switch mode power supplies and electrical motor drivers where high frequency switching of high power signals is required.

The current research efforts for prognostics and health management of these devices focus on the identification of failure mechanisms and the development of accelerated aging methodologies and systems to accelerate the aging process of test devices while performing in situ measurements of key electrical and thermal parameters. Accelerated aging systems allow for the understanding of the effects of failure mechanisms and the identification of leading indicators of failure, which are essential in the development of physics-based degradation models and in the prediction of remaining useful life. Some failure mechanisms of power transistors are related to the packaging of the devices, particularly due to mechanical stresses caused by thermal cycling. Thermal cycling, as an aging methodology, is regularly used to accelerate the aging of the devices by cycling between temperatures considerably higher than those seen in normal operation. This is representative of the way these devices operate in real world applications and is the methodology used to assess reliability.

This work presents results on the identification of precursors of failure with regard to the die-attach damage failure mechanism. An accelerated life test system under power cycling is used to induce die-attach degradation due to thermal overstress. Power cycling as a means to accelerate die-attach degradation is a common methodology established within electronics reliability testing.

Die-attach damage as a failure mechanism is suspected to be due to mechanical shear stresses generated within the interfaces of the device assembly. The device is basically a bi-material assembly where a silicon die is attached to a copper substrate using lead-free solder (die-attach); see Figure 5. It should be noted that silicon and copper differ greatly in their individual linear coefficients of thermal expansion. As a result, temperature cycling of this assembly results in a series of stresses created at the interfaces of the die-attach material with the die and the substrate. Under certain conditions, these stresses develop into cracks and voids in the die-attach material, which result in considerable degradation of the thermal dissipation capabilities of the device (due to a reduction in the area of contact of the interfaces, which diminishes thermal conduction). Consequently, the device is forced to operate at higher temperatures, thereby accelerating the degradation process even further.

The drain-to-source electrical on-resistance RDS(on) (the resistance when the MOSFET is in the on state) is a key parameter of a power MOSFET. This parameter is identified and isolated as a precursor of failure of the die-attach failure mechanism. RDS(on) is highly dependent upon junction temperature (Tj) and its normalization with respect to junction temperature will provide a clear window in the development of models of the degradation process as a function of observed RDS(on) values through time.

1.1  Related  work 

In  the  area  of  accelerated  aging  of  power  electronics, 
several  approaches  have  been  employed  in  terms  of  re¬ 
liability  studies.  One  such  method  involves  electrical 
pulsing  of  power  MOSFETs  under  controlled  temper¬ 
atures  to  cause  electro-thermal  fatigue  (Khong  et  al., 
2007, 2005). These experiments simulate stresses, and hence the accelerated aging conditions, typically experienced by automotive components with current levels of 120 A and a duty cycle of 5-10%. The experimental results have shown that the accelerated aging leads to an increase in the drain-source on-resistance of the power MOSFET. This increase was shown to be a result of die-attach delamination and bond-wire cracking at the source terminal.

A reliability assessment of power MOSFETs under high temperatures was performed by Dupont et al. (2007), where the devices were power cycled with a drain current of 150 A and a duty cycle of 30%. Junction temperatures up to 175°C were reported in this work. It was observed that a high change in junction temperature (ΔTj) resulted in high drain-to-source leakage current, while high on-resistance due to bond-wire cracking was observed when ΔTj was low.

Thermal stress and electrical stress are the most common accelerated aging methodologies. Thermal cycling and chronic temperature overstress are prevalent thermal stress methods. In thermal cycling, the devices are subjected to rapid changes in temperature, causing thermal expansion and contraction. The most common mode of failure from such aging methodologies is some form of package failure, such as die solder degradation and wire lift. Experiments on MOSFETs cycled 7000 times from 50°C to 100°C showed void formation in over 30% of the die-attach (Katsis & Wyk, 2003). Similar results were demonstrated for IGBTs (insulated gate bipolar transistors) undergoing power cycling (Morozumi et al., 2003; Thebaud et al., 2003; Wu et al., 1995); these results also showed the occurrence of wire lift in the devices. Another form of thermal overstress involves subjecting devices to high temperatures for extended periods of time. This type of aging accelerates time dependent dielectric breakdown (TDDB) (Stathis et al., 2005) and transistors stressed under this methodology have exhibited temperature dependent lifetimes (Reynolds, 1974).

In the area of identifying precursors to failure, IGBTs aged with self heating have shown changes in current ringing characteristics during switching (Ginart et al., 2008). In another study on the effects of electrical stress on power MOSFETs, high bias stress was applied at the gate; gate voltages ranging from 88 V to 94 V, applied for 2 hours with the drain and source grounded, resulted in a lowering of the threshold voltage and a reduction in mobility (Stojadinovic et al., 2005).

2.  ACCELERATED  AGING  SYSTEM 

Accelerated aging (also referred to as accelerated life testing) plays a very important role in the development of Prognostics and Health Management (PHM) solutions for electronics components and systems. Accelerated life testing (ALT) and highly accelerated life testing (HALT) are methodologies frequently used to assess the reliability of products. Accelerated life testing is an essential tool in reliability, particularly for products whose expected lifetime is on the order of thousands of hours, like electronics components and systems. In such situations, it is not feasible to wait for devices to fail under normal operation in order to compute a time to failure; therefore, ALT methods are used to predict reliability (Suhir, 2007). Accelerated life testing is used in the reliability field to: a) run devices to failure and compute mean time to failure (in this case the testing is destructive); or b) perform qualification tests, where the device is tested to see if it passes the test (Suhir, 2007). The development of prognostics algorithms faces the same constraints as reliability in the sense that run-to-failure data of critical electronics systems is rarely or never available. Furthermore, prognostics is concerned not only with time to failure of devices but with the degradation process as well. Therefore, it is necessary to include in situ measurements of key output variables and observable parameters in the accelerated aging process in order to develop and learn failure progression models.

Thermal, electrical and mechanical overstresses are regularly applied to accelerated aging of electronics components in reliability and PHM. The overstress is used in order to accelerate the aging of these devices, which otherwise could take several years to fail. This acceleration is important as it allows for an assessment of the component health in a considerably shorter time.

Thermal cycling and chronic temperature overstress lead to thermo-mechanical stresses in electronics due to mismatch in the coefficient of thermal expansion of the different elements in the component's packaged structure. As a result, thermal cycling is among the most prevalent accelerated aging methodologies for electronics. Thermal cycling subjects devices to rapid changes in temperature, causing thermal expansions and contractions which generate high mechanical stresses in the interfaces of thermally mismatched materials. It is a regular practice in reliability testing to use an environmental chamber or a heat plate in order to provide direct thermal cycling to an electronic structure while not applying any electrical power to these devices. In our aging setup, we use indirect thermal cycling for accelerated aging of power MOSFETs. In the aging methodology used in our work, thermal gradients result from electrical power applied to the devices, noting that no additional external heat sink is used during the aging process. This results in thermal cycles as well, but not exactly the same type of thermal cycling as found in electronics reliability literature.

2.1  Accelerated  aging  system  description 

The aging system used for these experiments is described in detail in (Sonnenfeld et al., 2008). This system allows for accelerated aging of gate controlled power transistors like power MOSFETs and IGBTs. The capabilities of this system include in situ measurements, different types of stress factors that accelerate device life, as well as custom made software that controls the experiment and logs the data from in situ measurements for further analysis. A high level block diagram is presented in Figure 1; details on the hardware and software implementations are available in (Sonnenfeld et al., 2008). In terms of accelerated life testing, the system can apply different stresses like thermal, electrical or a combination of both. The focus here is on thermal cycling, which is achieved by applying power cycling to the devices under test. The system allows for the investigation of different failure mechanisms (intrinsic and extrinsic) like dielectric breakdown, hot-carrier injection, electromigration, contact migration, wire lift, die-attach degradation and package delamination. This aging system was designed based on the work of Ginart et al. (2008) and it has been used in the aging of IGBTs and power MOSFETs in order to understand failure mechanisms, identify precursors of failure and develop degradation models for prognostics and health management of these devices (Celaya et al., 2009; Patil et al., 2009; Saha et al., 2009; Sonnenfeld et al., 2008).


Figure  1:  High  level  diagram  of  the  accelerated  aging 
system. 


2.2  Aging  Experiments 

The accelerated aging applied to the devices presented in this work consists of thermal cycling to accelerate degradation using thermal overstress. Latch-up, thermal runaway, or failure to turn ON due to loss of gate control were the failure conditions we considered. Thermal cycles were induced by power cycling the devices without the use of an external heat sink. This greatly reduced the heat dissipation capabilities of the devices, allowing for self heating of the device during the power switching operation. The device case temperature was the measured and controlled variable for the thermal cycling application. Temperatures were measured in situ using a thermocouple attached to the flange of the copper case.


To enable power cycling, the applied gate voltage was a square wave signal with an amplitude of ~15 V, a frequency of 1 kHz and a duty cycle of 40%. Proper amplification of this signal ensured that enough current was available to charge the gate of the MOSFET at the selected frequency and duty cycle. The drain-source was biased at 4 V DC and a resistive load of 0.2 Ω was used on the collector side output of the device.

Figure 2 shows the typical response to the square wave control signal applied to the gate. The drain current Id is greater than zero once the device is in the ON state (the figure shows the voltage output of the current sensor, which is proportional to Id). As a direct result, the drain to source voltage Vds drops. This plot represents the high speed in situ measurements available for further analysis. These measurements are taken only while the device is in the power cycling (switching) regime during the aging process, approximately 400 ms apart. The sampling frequency of these measurements allows for the complete observation of the pulse, which has a duration of 0.4 ms. There are ~3000 transient measurements available for every 35 minutes of aging. These measurements are used to compute RDS(on) and could be used to compute ringing characteristics of the turn ON and turn OFF transients as well.

Figure 2: Transient response to a square control signal at the gate.
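
The computation of RDS(on) from one recorded transient can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name, the ON-state current threshold and the sample values are hypothetical, and RDS(on) is simply taken as the mean ratio Vds/Id over the samples in which the device is conducting.

```python
from statistics import mean


def on_resistance(vds, i_d, i_on_threshold=1.0):
    """Estimate RDS(on) in ohms from one sampled switching transient.

    vds, i_d: equal-length sequences of drain-source voltage (V) and
    drain current (A). i_on_threshold (A) is a hypothetical cutoff used
    to decide which samples belong to the ON state.
    """
    if len(vds) != len(i_d):
        raise ValueError("vds and i_d must have the same length")
    # Keep only samples where the device is conducting.
    ratios = [v / i for v, i in zip(vds, i_d) if i > i_on_threshold]
    if not ratios:
        raise ValueError("no ON-state samples found in the transient")
    return mean(ratios)


# Illustrative (made-up) samples: OFF, then ON, then OFF again.
vds_samples = [4.0, 4.0, 0.50, 0.52, 0.48, 4.0]
id_samples = [0.0, 0.0, 10.0, 10.4, 9.6, 0.0]
r_on = on_resistance(vds_samples, id_samples)
```

In practice the current sensor output is a voltage proportional to Id, so a sensor scale factor would have to be applied before forming the ratio.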

Temperatures were controlled within a low and high temperature range. The device was set to the power cycling (switching mode) regime if the case temperature fell below the lower threshold Tmin and it was turned completely off if the temperature reached the upper threshold Tmax. This hysteresis controller, illustrated in Figure 3, provided the thermal cycles needed to accelerate the aging of the device. During power cycling (switching mode), the square wave control signal described above is used to switch the device. This dissipates a large amount of power, resulting in rapid excessive heating of the device in the absence of a heat sink. It should be noted that currents and voltages were maintained within the safe operating area while the temperature was raised beyond the maximum rating to induce damage.
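
The temperature control logic described above can be sketched as a two-state hysteresis rule. The following is an illustrative sketch only: the function names and the threshold values are assumptions chosen for the example and are not the settings of the aging system.

```python
def next_switching_state(switching, case_temp, t_min, t_max):
    """Return True if the device should be in the power cycling regime.

    switching: current state (True = power cycling, False = off).
    case_temp: measured case temperature.
    t_min, t_max: lower and upper thresholds (hypothetical values).
    """
    if case_temp >= t_max:
        return False      # upper threshold reached: turn the device off
    if case_temp < t_min:
        return True       # fell below lower threshold: resume switching
    return switching      # inside the band: keep the previous state


def simulate(case_temps, t_min, t_max, switching=True):
    """Apply the hysteresis rule to a sequence of temperature readings."""
    states = []
    for temp in case_temps:
        switching = next_switching_state(switching, temp, t_min, t_max)
        states.append(switching)
    return states


# Made-up temperature trace (degrees C) with assumed thresholds.
trace = [100, 150, 200, 180, 120, 99, 130]
states = simulate(trace, t_min=100, t_max=200)
```

Because the state is retained while the temperature is between the two thresholds, the device heats until Tmax is reached and then cools until the temperature falls below Tmin, which produces the thermal cycles.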

Annual Conference of the Prognostics and Health Management Society, 2010

Figure 3: Thermal overstress aging control.

In addition to the transient measurements described above, there are additional measurements at a lower sampling speed which are used to control the aging experiments. A snapshot of these measurements is presented in Figure 4. In this figure, only Id and Tc are plotted along with the logic state of the device (gate state). When the device is in the power cycling regime (gate state is 1), the current flows through the drain and the temperature starts to rise. On the other hand, temperature decreases when the device is not switching. These measurements are single value measurements with a sampling time of ~400 ms. Their main objective is to provide control variables for the experiment, provide a visual online assessment of the aging process and monitor different temperatures. In addition to Id, voltages like Vdd and Vds are monitored. These are not used to compute RDS(on), given that they do not have enough resolution to ensure that the measurement was taken during the ON state.


Figure  4:  Aging  control  measurements. 


3. DIE-ATTACH DAMAGE AS FAILURE MECHANISM

As described earlier in section 1.1, thermal cycling is typically used for accelerated aging in the field of electronics reliability. The objective of accelerated life testing in electronics reliability is to run the devices to failure in a controlled fashion and in a period of time considerably shorter than the intended life of the device in real life operation. Thermal cycling is widely used to trigger failure mechanisms related to the packaging of the device or to stress bi-metal assemblies typical in flip-chip designs and power transistors in TO-220 packages, where the copper case of the packaging serves as the drain/collector pin, providing electrical and thermal dissipation capabilities. In general, a reliability study would be concerned with the time to failure under an accelerated life test, disregarding the progression of the degradation process and the progression of an incipient fault into a failure. A physics of failure reliability study would apply thermal cycling in an environmental chamber (no electrical power applied) to several devices and record their corresponding times to failure. These data are then used to fit empirical models based on physics like a power law, Boltzmann-Arrhenius, or Coffin-Manson, depending on the failure mechanism. These models are then used to predict reliability in terms of mean time to failure or other measures like time between failures or availability.

The accelerated aging methodologies used in reliability serve as a good starting point for generating run-to-failure data for prognostics algorithm development (data-driven or physics-based). These methodologies should be enhanced to include measurements of key variables throughout the test in order to assess the health of the system. This work makes use of the thermal cycling aging methodology but uses electrical power to heat the device. This results in thermal cycles as well and it is closer to the way these devices are used in fielded applications.

3.1 Thermal stresses due to thermal cycling

The power MOSFET under consideration can be regarded as a bi-metal assembly in flip chip design as shown in Figure 5. The substrate of the assembly is considered to be the copper plate of the device, the chip is the bare die and it is attached to the substrate by lead-free solder (die-attach). This is a thermally mismatched assembly due to the great difference in the linear coefficient of thermal expansion (CTE) of the materials. The CTE is 16-18 ppm/°C for copper, 2.6-3.3 ppm/°C for silicon and 20-22.9 ppm/°C for lead-free solder.

Figure  5:  Bi-metal  assembly  representation  of  power 
MOSFET  under  thermo-mechanical  loading. 

This mismatch in CTE generates stresses on the solder. Consider the manufacturing process at high temperature (~100°C) at which the different layers are assembled. When the device is allowed to cool down to room temperature, the solder material will be in tension because it is trying to shrink and the silicon is not allowing that to happen. A similar situation arises during thermal cycling.
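
To give a feel for the size of the mismatch, the free (unconstrained) differential thermal strain between two layers can be estimated as the CTE difference times the temperature swing. The sketch below uses the midpoints of the CTE ranges quoted for copper and silicon; the 100°C swing and the function name are assumptions for illustration, and the result is not a stress calculation (that requires the elastic model discussed next).

```python
def mismatch_strain(alpha_a_ppm, alpha_b_ppm, delta_t):
    """Free differential thermal strain (dimensionless) between two
    materials with CTEs given in ppm/degC, for a temperature swing
    delta_t in degC."""
    return abs(alpha_a_ppm - alpha_b_ppm) * 1e-6 * delta_t


CTE_COPPER = 17.0    # ppm/degC, midpoint of the quoted 16-18 range
CTE_SILICON = 2.95   # ppm/degC, midpoint of the quoted 2.6-3.3 range

# Assumed temperature swing of 100 degC (illustrative only).
strain = mismatch_strain(CTE_COPPER, CTE_SILICON, delta_t=100.0)
```

Under these assumptions the mismatch is about 1.4e-3 (0.14%) per cycle, and it is this differential expansion that the die-attach layer has to accommodate.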

Suhir (1986) developed a model of the thermo-mechanical stresses in bi-metal assemblies. It was based on the theory of elasticity and compliance of the materials. This model identifies shearing stresses on the interfaces of the solder as well as normal stresses perpendicular to the interface. There is a high stress concentration at the ends of the assemblies, resulting in the formation of cracks and voids in the solder material. This model provides the theoretical foundation for the die-attach failure mechanism as a result of thermal cycling. It demonstrates that mechanical stresses in the interfaces are a function of the temperature differentials (ΔT) applied through the thermal cycling process. These stresses give way to crack initiation in the die-attach; the cracks then grow as a function of continuing thermal cycling.

3.2  Assessment  of  die-attach  health  through 
thermal  measurements 

The die-attach damage due to thermal cycling results in a reduction of the area of contact at the solder-copper and solder-silicon interfaces. The heat transfer characteristics due to thermal conduction in the die-attach degrade given the decrease in the area of contact. Heat flow is reduced on a degraded die-attach compared with a pristine die-attach.

The steady-state model of the thermal conduction from junction to case is presented in the following equation:

$$T_j = T_c + \theta_{jc} P$$

where θjc is the junction-to-case thermal impedance in °C/W and P is the electrical power dissipated.

As a result of die-attach damage, it is expected that θjc increases. This will result in a higher Tj for the operation of a degraded device, assuming that the power dissipation and the ambient temperature remain fixed. This is a high level approximation of the heat transfer characteristics of the device. The thermal resistance can be measured experimentally with the use of specialized equipment and it provides an indication of the severity of the degradation in the die-attach.
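
The effect of an increased thermal impedance can be illustrated with a short calculation. The sketch assumes the common steady-state junction-to-case relation Tj = Tc + θjc·P with the case temperature held fixed; the numerical values of Tc, P and θjc are made up for the example and are not measurements from this work.

```python
def junction_temperature(t_case, theta_jc, power):
    """Steady-state junction temperature (degC), assuming
    Tj = Tc + theta_jc * P, with theta_jc in degC/W and P in W."""
    return t_case + theta_jc * power


# Hypothetical operating point: 100 degC case temperature, 20 W dissipated.
tj_pristine = junction_temperature(t_case=100.0, theta_jc=1.0, power=20.0)
tj_degraded = junction_temperature(t_case=100.0, theta_jc=1.5, power=20.0)
delta_tj = tj_degraded - tj_pristine
```

With these assumed numbers, a 50% increase in θjc raises the junction temperature by 10°C at the same case temperature and power.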

Die-attach degradation can also be assessed by measuring heating curves of the junction temperature. The military standard MIL-STD-750, method 3161, presents a methodology for thermal impedance measurements of vertical power MOSFETs using the delta source-drain voltage method. This methodology takes advantage of the body diode present in these devices. The voltage drop in the diode is proportional to the junction temperature and it is a very accurate way to measure Tj. The device is heated by operating it in the on state by biasing the gate. Id is modulated with the gate voltage to provide the required power (in watts) to heat the device. The device is heated for a fixed amount of time, followed by a temperature measurement. In order to make the temperature measurement, the body diode is biased while keeping the gate voltage at ground to ensure that a channel is not formed. A small current Isd is used to measure the voltage drop on the body diode. This voltage is proportional to Tj.
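
Converting a measured body-diode voltage into a junction temperature requires a calibration. The sketch below assumes a linear calibration around a reference point, obtained beforehand at the same small sensing current; the calibration constants, including the slope of -2 mV/°C, and the function name are hypothetical and are not taken from the standard or from this work.

```python
def junction_temp_from_vsd(v_sd, v_ref, t_ref, k_factor):
    """Invert an assumed linear calibration
        V_sd = v_ref + k_factor * (Tj - t_ref)
    to recover Tj (degC) from the body-diode voltage V_sd (V).
    k_factor is the calibration slope in V/degC."""
    if k_factor == 0:
        raise ValueError("k_factor must be non-zero")
    return t_ref + (v_sd - v_ref) / k_factor


# Hypothetical calibration: 0.65 V at 25 degC, slope of -2 mV/degC.
tj = junction_temp_from_vsd(v_sd=0.50, v_ref=0.65, t_ref=25.0,
                            k_factor=-0.002)
```

Repeating this conversion after heating pulses of increasing duration yields the heating curve of the junction temperature.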

The heating curve provides an assessment of the thermal characteristics of the die-attach. Figure 6 shows the heating curves for a pristine device and for an aged device using the aging procedure described in the following section. The heating time is one second and it can be clearly observed that the diminished thermal dissipation capabilities of the aged device result in a rapid increase of the junction temperature. The steep slope starting at ~10 ms is indicative of the thermal performance of the die-attach.


Figure  6:  Die-attach  thermal  performance  assessment 
(aged  =  device  #11). 


3.3  Failure  analysis  of  aged  devices 

Failure analysis techniques like Scanning Acoustic Microscopy (SAM) and X-rays have been used to assess the state of the die-attach for power MOSFETs and IGBTs aged with the previously described aging system (Celaya et al., 2009; Ginart et al., 2008; Patil et al., 2009).

Figure 7 is an X-ray image of a new device. The dark area represents the die-attach solder (the silicon is transparent in an X-ray image); the rectangular shape represents the solder below the die. It can be seen that there is solder below the die covering all the die area; therefore, the area of contact for heat transfer by conduction from the silicon to the copper is the same as the area of the die. Figure 8 is an X-ray image of device number 8 after aging. The aging procedure is described in Table 1 in section 4.1. The shadows on the bottom (the region marked with a dotted red line) represent solder material that has migrated as a result of the thermal stresses and the increase of internal temperatures beyond the melting point of the solder. Voids are also observed below the die area. As a result, the area of contact for heat conduction has decreased, resulting in a decrease in the thermal dissipation performance of the device. This result is consistent with the observed thermal dissipation performance presented in section 3.2. Similar results are obtained for the remaining aged devices used for this work.

The theory of thermo-mechanical stresses along with the failure analysis and the thermal assessment of die-attach performance indicate that the proposed aging methodology generates die-attach degradation as a failure mechanism.


Figure 7: X-ray image of new device.

Figure 8: X-ray image after aging (device #8).


4.  DRAIN  TO  SOURCE  ON  RESISTANCE  AS  A 
PRECURSOR  OF  FAILURE 

The relationship between RDS(on) and Tj for a power MOSFET has been well documented by Baliga (2008) and it has been described in the context of accelerated aging for PHM in (Celaya et al., 2009). As described in previous sections, Tj increases as a result of die-attach degradation for a fixed ambient temperature or for a fixed Tc. In a power MOSFET, RDS(on) increases with temperature due to several factors like the reduction of mobility in the drift region and the inversion layer, as well as a reduction in the threshold voltage VGS(th) (Baliga, 2008). As a result, the power MOSFET has a positive temperature coefficient for on-resistance, which results in an increase in power losses when operated at high temperatures (Baliga, 2008).

The mobility in the drift region decreases proportional to T^(-2.42) and the mobility of the inversion layer decreases proportional to T^(-1) (Baliga, 2008). As a result, RDS(on) is expected to increase quadratically with respect to Tj. This agrees with the experimental results presented in the following section. Even though Tc is used to make the assessment, a quadratic change in the resistance is observed as the temperature increases during the aging experiments.
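
The strength of this temperature dependence can be illustrated with a short calculation. The sketch assumes that the drift-region contribution to the resistance is inversely proportional to the drift-region mobility, and takes the mobility exponent as 2.42; the reference temperature of 300 K and the function name are assumptions, and the other components of RDS(on) are ignored.

```python
def drift_resistance_ratio(t_kelvin, t_ref_kelvin=300.0, exponent=2.42):
    """Ratio R(T) / R(T_ref) of the drift-region resistance, assuming
    mobility proportional to T**(-exponent) and resistance inversely
    proportional to mobility. Temperatures are absolute (kelvin)."""
    if t_kelvin <= 0 or t_ref_kelvin <= 0:
        raise ValueError("temperatures must be positive (kelvin)")
    return (t_kelvin / t_ref_kelvin) ** exponent


# Heating the junction from 300 K to 400 K (illustrative values).
ratio = drift_resistance_ratio(400.0)
```

Under these assumptions the drift-region resistance roughly doubles between 300 K and 400 K, an exponent close to 2 that is in line with the approximately quadratic trend described above.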

The  selection  of  Rds(ou)  as  a  potential  precursor  of 
failure  for  die-attach  degradation  process  comes  natural 
as  a  result  of  the  solid-state  physics  of  the  power  MOS¬ 
FET  structure.  This  also  applies  to  a  packaged  device 
like  the  devices  under  consideration  in  this  work  which 
have  a  TO-220  package.  The  arguments  in  support  of 
this  variable  as  precursor  of  failure  are  the  following; 

•  destructive  testing  to  assess  health  of  the  die-attach 
is  not  an  option  for  a  PHM  application; 

•  even though there are ways to assess the health of the
die-attach by assessing thermal performance, the required
equipment might not be usable for in situ assessment;

•  the on-resistance provides a window into the degradation
process, making a case for physics-based models of the
die-attach degradation process in which the state of the
die-attach could be observable through response variables
like RDS(on);

•  it is relatively easy to measure RDS(on) in situ, and
there are discrete devices on the market that provide
sensing capabilities for ID.
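
As a minimal sketch of the in situ computation, assuming paired samples of VDS and ID taken while the device is conducting, RDS(on) can be estimated as the ratio VDS/ID. The function name, the sample values and the 0.1 A on-state threshold are illustrative assumptions and are not part of the aging system described in this work.

```python
from statistics import median


def on_resistance(v_ds, i_d, min_current=0.1):
    """Estimate on-resistance (ohms) from paired VDS (V) and ID (A) samples.

    Samples with drain current below min_current are treated as off-state
    and skipped; the threshold is a hypothetical choice.
    """
    ratios = [v / i for v, i in zip(v_ds, i_d) if i >= min_current]
    if not ratios:
        raise ValueError("no on-state samples found")
    # The median reduces the influence of switching transients.
    return median(ratios)


# Synthetic samples: three on-state points and one off-state point.
v_ds = [0.40, 0.42, 4.0, 0.41]
i_d = [2.0, 2.1, 0.0, 2.05]
r_on = on_resistance(v_ds, i_d)
```

The median is used here instead of the mean so that a few samples captured during switching edges do not dominate the estimate; this is a design choice of the sketch, not a statement about how the values in this paper were computed.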

4.1  Experimental  Results 

As described above, the aging methodology results in thermal
overstress as a result of thermal cycling and internal
temperatures beyond the normal operating range of the
devices. In the aging system, there is no access to junction
temperature (Tj) measurements. Therefore, measurements of
package temperature (Tp) and case temperature (Tc) are used
instead. While Tc and Tp are directly affected by variations
in Tj resulting from power switching, the relationship
between these temperatures and Tj is rather complex. Several
factors contribute to this. The thermal impedance between
the die and the case consists of several layers of different
materials, i.e. die (silicon), die-attach (lead-free
solder), case (copper), and the package (epoxy). This
thermal impedance has capacitive and resistive components
that further vary independently in all the layers.
Furthermore, an accurate in situ assessment of Tj is
restricted by the high switching frequency in our
experiments, which is much faster than the thermal dynamics,
so that changes in Tj are not immediately reflected on the
outside. This causes a delay between the measured RDS(on)
and the measured temperatures. The effects of such delays
can be clearly seen in the measurements shown in Figure 9.
This figure shows how the temperature measured on the
package (blue in upper chart) lags the changes in RDS(on),
whereas the temperature measured on the copper case (green
in lower chart) shows a much better correspondence to the
computed RDS(on). Therefore, in the rest of the study all
analysis was done using Tc.
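
One hedged way to quantify such a delay is to search for the sample shift that maximizes the correlation between RDS(on) and a temperature signal. The sketch below uses synthetic data in which the temperature is a scaled copy of RDS(on) delayed by three samples. The function names, the data and the linear scaling are assumptions made for illustration and do not reproduce the measurements in Figure 9.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def best_lag(r_on, temp, max_lag):
    """Return the lag (in samples) by which temp trails r_on."""
    scores = {}
    for lag in range(max_lag + 1):
        a = r_on[: len(r_on) - lag]  # earlier RDS(on) samples
        b = temp[lag:]               # later temperature samples
        scores[lag] = pearson(a, b)
    return max(scores, key=scores.get)


# Synthetic RDS(on) trace (ohms) and a temperature trace delayed by 3 samples.
r_on = [0.20, 0.21, 0.23, 0.26, 0.24, 0.22,
        0.21, 0.25, 0.28, 0.27, 0.23, 0.22]
delayed = [r_on[0]] * 3 + r_on[:-3]
temp = [150 + 300 * r for r in delayed]  # hypothetical linear scaling to °C

lag = best_lag(r_on, temp, max_lag=5)
```

A small estimated lag for one sensor and a large lag for another would support preferring the former for analysis, which is the kind of comparison made between Tc and Tp above.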

As a device undergoes thermal cycling, its internal
structure undergoes mechanical stresses at the interfaces
due to the different thermal expansion and contraction rates
of the layer materials. Persistent cycling under such
stresses is expected to create imperfections or even cracks
in the die-attach layer. However, in our first aging
methodologies (Celaya et al., 2009; Patil et al., 2009;
Sonnenfeld et al., 2008) the devices were operated at very
high temperatures, estimated to be beyond the melting point
of the die-attach material. In some cases, it was observed
that molten material had cracked apart the package and
appeared on the surface, ultimately leading to device
failure. In the results presented here, the tests were
carefully designed to avoid such situations and to age
multiple devices in a controlled and repeatable manner. This
meant aging at lower temperatures and following a step load
temperature profile as indicated in Table 1. The temperature
levels were adjusted to keep a maximum of Tc = 250°C within
the first aging cycle and to successively reduce it by 10°C
in further cycles until the device failed. Five devices were
aged under this procedure.


Figure  9:  Relationship  of  package  and  case  temperatures 
to  on-resistance. 


Table 1: Aging regime (all temperatures in °C)

Aging run   Target Tc   Tmin   Tmax   Aging time (min)
1           250         249    250    35
2           240         239    240    35
3           230         229    230    35
4           220         219    220    35
5           210         209    210    240
6           210         209    210    180
7           210         209    210    180

Our hypothesis behind these tests was that if thermal
cycling causes die-attach degradation, its effect should be
reflected in the on-resistance computed from VDS and ID
measurements. A limitation in establishing RDS(on) as a
precursor of die-attach degradation was its dependence on
junction temperature, which could not be easily isolated in
our experiments at this time. Therefore, we attempted to
learn the dependence of RDS(on) on temperature using data
from five repeated experiments conducted under similar
environmental conditions and aging regimes. Figure 10 shows
the relationship between the measured on-resistance and the
temperature in the initial aging runs, where the device is
expected to be in pristine and aged conditions,
respectively.


Figure 10: RDS(on) vs. Tc for the first 4 aging runs (plot
title: RDS(on) vs. Temperature, Device 9).

This device lasted the seven aging cycles, spanning over 13
hours. During the first four aging cycles we see negligible
changes in the relationship between on-resistance and
temperature. However, as shown in Figure 11, in the next
three successive tests this relationship changes
drastically. We attribute these changes to the die-attach
degradation, which was confirmed in the previous section. As
described in the previous section, RDS(on) has a quadratic
relationship with temperature, and this can be observed in
the previous figures even though they show Tc instead of Tj.
The migration of these curves as a function of aging time
could be used as a mechanism for damage detection and for
diagnostics of the die-attach failure mechanism. It could
also be used to assess the remaining life of the device by
considering, for example, the red curve from aging run 7
(Figure 11) as a failure threshold.
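
The curve migration can be tracked numerically by fitting a quadratic to RDS(on) versus Tc for each aging run and comparing the fitted curves at a common reference temperature. The following is a minimal sketch under stated assumptions: the data are synthetic, the temperature is expressed as an offset from a hypothetical reference of 200 °C, and the helper names are illustrative only.

```python
def fit_quadratic(t, r):
    """Least-squares fit r ~ c0 + c1*t + c2*t**2; returns (c0, c1, c2)."""
    # Power sums that form the 3x3 normal equations.
    s = [sum(x ** k for x in t) for k in range(5)]
    b = [sum(y * x ** k for x, y in zip(t, r)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, 3):
            f = a[row][col] / a[col][col]
            for k in range(col, 4):
                a[row][k] -= f * a[col][k]
    # Back substitution.
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(a[i][j] * c[j] for j in range(i + 1, 3))
        c[i] = (a[i][3] - tail) / a[i][i]
    return tuple(c)


def evaluate(coeffs, t):
    """Evaluate the fitted quadratic at temperature offset t."""
    return coeffs[0] + coeffs[1] * t + coeffs[2] * t ** 2


def curve_shift(baseline, current, t_ref):
    """Vertical shift of the current curve over the baseline at t_ref."""
    return evaluate(current, t_ref) - evaluate(baseline, t_ref)


# Synthetic data: Tc offsets from 200 °C and RDS(on) in ohms.
t = [-40.0, -20.0, 0.0, 20.0, 40.0]
r_early = [0.2 + 0.001 * x + 0.00001 * x ** 2 for x in t]
r_late = [y + 0.03 for y in r_early]  # hypothetical aged run

early = fit_quadratic(t, r_early)
late = fit_quadratic(t, r_late)
shift = curve_shift(early, late, t_ref=0.0)
```

Comparing curves at a fixed reference temperature is one way to separate the aging effect from the temperature effect; a failure threshold on this shift would still need to be defined from experimental evidence.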

Figure 11: RDS(on) vs. Tc for all aging runs (plot title:
RDS(on) vs. Temperature, Device 9).


Figure 12: RDS(on) vs. aging time for all devices (plot
title: Increase in RDS(on) due to Device Degradation).


Figure 12 shows RDS(on) as a function of aging time for all
the devices. RDS(on) is normalized by the values measured at
pristine condition. As a result, this plot shows how RDS(on)
increases as aging progresses and damage grows in the
die-attach region. It can also be observed that failure
could be defined before the seventh aging run. For example,
on device 14, there are high values of RDS(on) in the
neighborhood of the 300th minute of aging. For all those
high values, the device loses gate control and cannot turn
on. This is considered a failure. Further investigation is
required to define the failure criteria under the die-attach
failure mechanism. Qualification standards for reliability
specify the amount of deviation of a device parameter at
which the device is considered to have failed. There is a
risk involved in declaring the failure too late. Internal
temperature keeps rising as a result of die-attach damage;
eventually, the high temperature will cause other
structures, such as the gate oxide, to fail. This in turn
makes it more challenging to predict remaining life if
multiple failure mechanisms are involved in the process or
experiment.
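
The normalization and a threshold-based failure declaration can be sketched as follows. The data are synthetic, and the 20% threshold is a hypothetical placeholder, since the failure criterion for this mechanism is still to be defined as noted above.

```python
def normalize(r_on, r_pristine):
    """Normalize RDS(on) values by the value measured at pristine condition."""
    return [r / r_pristine for r in r_on]


def first_crossing(times, values, threshold):
    """Return the first time at which values reach the threshold, or None."""
    for t, v in zip(times, values):
        if v >= threshold:
            return t
    return None


# Synthetic aging record: time in minutes, RDS(on) in ohms.
times = [0, 100, 200, 300]
r_on = [0.20, 0.21, 0.23, 0.26]

normalized = normalize(r_on, r_pristine=r_on[0])
# Hypothetical criterion: declare failure at a 20% increase over pristine.
t_fail = first_crossing(times, normalized, threshold=1.2)
```

Choosing a lower threshold declares failure earlier, which reduces the risk of secondary failure mechanisms contaminating the run-to-failure data at the cost of a shorter observed life.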

Figure 13 shows the increase in RDS(on) for each of the 7
aging runs. The y-axis represents the increase in RDS(on) at
the end of the aging run relative to the value at the start
of the run. The results are presented as box-and-whisker
plots to aggregate the results for the 5 devices under test.
For example, the sixth box summarizes the increase for the
sixth aging run (described in Table 1) for the 5 samples (5
aged devices). The red line in each box represents the
median, and the blue lines of the box represent the first
and third quartiles. They are used here to estimate the
location and spread of the increase in RDS(on) for each of
the aging runs. It should be noted that the aging parameters
are fixed for each aging run in terms of aging time, load,
VGS and VDD. Small increments can be observed in the first 4
aging runs (35-minute runs). A big shift in the parameter is
observed in the fifth run, which lasted 4 hours. In general,
the increase in RDS(on) is larger for aging runs of longer
duration. This is expected, since aging time is proportional
to the number of thermal cycles: the larger the number of
thermal cycles, the larger the damage to the die-attach and
the higher the values of RDS(on).
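
The per-run summary statistics behind a box plot can be computed as in the sketch below. The start and end values for the five devices are synthetic, and the inclusive quartile convention is an assumption; plotting tools may use a different convention, so the numbers need not match Figure 13.

```python
from statistics import quantiles


def run_increase(r_start, r_end):
    """Increase in RDS(on) over one aging run for one device."""
    return r_end - r_start


def box_stats(increases):
    """First quartile, median and third quartile of the per-device increases."""
    q1, med, q3 = quantiles(increases, n=4, method="inclusive")
    return {"q1": q1, "median": med, "q3": q3}


# Synthetic RDS(on) values (ohms) at the start and end of one aging run
# for five devices.
starts = [0.200, 0.198, 0.205, 0.201, 0.199]
ends = [0.210, 0.210, 0.220, 0.212, 0.219]

increases = [run_increase(s, e) for s, e in zip(starts, ends)]
stats = box_stats(increases)
```

With only five devices per run, the quartiles are coarse estimates of spread, so they are best read as a qualitative indication of how consistently the devices degrade within a run.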


Figure 13: Changes in on-resistance for each aging run
(x-axis: aging runs 1 to 7).


5.  CONCLUSION 

A methodology for accelerated aging of a commercial power
MOSFET (IRF520Npbf) in a TO-220 package is presented. This
methodology, based on thermal and power cycling, triggers
the die-attach failure mechanism, which is a common failure
mechanism for discrete devices where the chip is attached to
the case of the package by lead-free solder. Experiments
with X-ray imaging and thermal performance assessment of the
structure corroborate the known theory that thermal cycling
results in die-attach degradation for flip-chip type
structures.

In addition, on-resistance (RDS(on)) has been identified as
a precursor of failure for the die-attach failure mechanism.
Its dependence on junction temperature provides a window
into the degradation process, and it could be used for
data-driven prognostics algorithms or for the development of
physics-based models to be used for prognostics in a
Bayesian update framework.

The contributions of this work are manifold. It provides a
consistent aging methodology for accelerated aging of
devices for PHM development. This methodology is based on
power cycling, which is closer to real-life application than
standard thermal cycling performed in an environmental
chamber with no electricity running through the device. The
identification of RDS(on) as a precursor of failure and the
methodology to normalize RDS(on) with respect to the
temperature of the device are very important. Work done in
the past considered this variable as a precursor of failure,
but the measured values throughout the aging process were
highly influenced by the environment temperature, making an
isolated assessment of the degradation difficult. RDS(on),
as presented here, is proportional to the damage magnitude
of the device, so it could serve as a feature for fault
detection and diagnosis. Furthermore, it could be used to
develop remaining useful life prediction algorithms. This is
possible because RDS(on) is computed from in situ
measurements in the aging system, and values of RDS(on) can
be computed throughout the aging process. The data obtained
from these experiments and presented in this work represent
run-to-failure data for five devices aged under the same
varying operational conditions. The data set contains in
situ high-speed measurements of key physical variables that
allow for the characterization of the transient response and
hence the computation of key static and dynamic parameters
of the devices.

Future work in this area will consist of the development of
data-driven prognostics algorithms based on this data set.
In addition, efforts to model the degradation process based
on the physics are taking place. This data set will be made
public in order to provide the PHM research community with
run-to-failure data for the development of prognostics
algorithms for power MOSFETs.


NOMENCLATURE

RDS(on)   Drain to source on-state resistance
Tj        Junction temperature (°C)
Tc        Case temperature (copper plate) (°C)
VDS       Drain to source voltage
VDD       Power supply voltage applied to the drain-source
          circuit (rail voltage)
θjc       Junction to case thermal resistance in °C/W
ID        Drain current (A)